Quantifying atmospheric aerosols and their links to climatic effects is necessary to understand the dynamics of climate
forcing and to improve our knowledge of climate change. Because of its reactivity to precipitation, temperature, topography,
and human activity, the atmospheric boundary layer (ABL) is one of the most dynamic atmospheric regions: ABL aerosols have a
large impact on the evolution of the radiative forcing of climate change, human health, food security, and, eventually, the
local and global economy. Continuous monitoring, together with instrumental and computational approaches, is required for the
detection and analysis of ABL behavior. This paper presents a deep learning-based system for detecting aerosols in the outer
layer, built on Light Detection and Ranging (LiDAR) data fusion. The proposed method applies sequential models to turn
low-level data into compressed features using object-based analysis, feature-level fusion, and autoencoder-based
dimensionality reduction. Convolutional neural networks (CNNs) were used to convert the compressed data into high-level
features that can be used to categorize air particles in the outer layer. When applied to LiDAR data, the deep learning
approaches described here detected 40% more atmospheric features at a horizontal resolution of 5 km during daytime operations.
Compared with existing deep learning algorithms for edges and complicated near-surface scenes during the day, a
convolutional autoencoder (CAE) trained on LiDAR dataset standard data products showed the potential for improved
aerosol discrimination, with 98% accuracy.


1. Introduction

Atmospheric characteristics such as clouds and aerosols play a
vital role in the Earth's climate system, air quality, and
hydrological cycle, to an extent that depends mainly on their
height, thickness, and type [1]. Liquid water clouds near the
surface of the Earth tend to reflect incoming sunlight, cooling
the surface. However, ice clouds in the upper troposphere absorb
and reradiate heat emitted from the surface, warming the surface
of the Earth. Aerosol particles include windblown desert dust,
wildfire smoke, sulfurous particles from volcanic eruptions,
and fossil fuel particulate matter [2]. Aerosols cool or warm
the surface, depending on their size, composition, and location
in the atmosphere [3].

A CNN is a supervised machine learning method used for the
recognition of image features. Commercial uses of CNNs include
a wide variety of object detection and semantic segmentation
problems [3]. CNNs have also been used to predict tropical
cyclone intensity from satellite imagery and to detect
hailstorms in radar images with higher accuracy than previous
techniques. After the CNN's layer architecture is instantiated,
the model is trained with truth data sets, from which it learns
to predict the proper characteristics in an image [4]. While
training a CNN can take a long time, its predictions are fast
compared with older algorithms or a manual approach. A
collection of CNNs has been built to predict the positions of
clouds and aerosols in CATS LiDAR data, to increase the speed
at which LiDAR data can be distributed and to establish the
feasibility of providing real-time layer-type products [5].

Natural and artificial aerosol emissions, such as biomass
burning, can significantly threaten urban and regional air
quality. As a result, it is crucial to understand the optical,
microphysical, and geometrical characteristics of local or
targeted aerosol emissions in the boundary layer. The LiDAR
sensor, which uses a laser as its source, can provide vertically
resolved aerosol profiles with high temporal and spatial
resolution [6]. LiDAR remote sensing observations therefore aid
in the study and characterization of aerosol emissions from
source to destination and support efforts to improve air quality.
This section welcomes submissions on the most recent results
and advancements in LiDAR remote sensing of optical,
microphysical, and geometrical aerosol properties from airborne
LiDARs, regional ground-based LiDAR networks, and global
satellite missions, across all instrument platforms (Raman,
high-spectral resolution, DIAL, and others), temporal and
spatial scales, and from airborne-moon missions. LiDAR remote
sensing and the identification of anthropogenic aerosols that
affect air quality from industrial, biomass burning, and
agricultural sources, as well as campaigns aimed at giving a
full assessment of climate and health consequences, are
especially encouraged [7]. Man-made aerosol emissions in cities,
linked to their impacts on micrometeorology and the radiative
budget (i.e., their role in heating or cooling the atmospheric
column and promoting or suppressing convection), are given
specific emphasis [8]. We have
used convolutional autoencoder (CAE) models [7] to detect
aerosols from the fused LiDAR dataset [8]. The CAE's ability to
extract the aerosol type is influenced by the physical content
and uncertainty of the optical inputs, as well as by the CAE
structure and training technique, notably the size of the data
set employed for training. An aerosol model detailing the
optical characteristics of distinct particles was created to
provide a consistent description of the aerosol types. This
model can provide a representative and statistically meaningful
synthetic database that reproduces known aerosol features. This
synthetic data collection is important because few
observational data sets are statistically relevant, well
characterized, and representative of the whole range of aerosol
species. Normalization is a common practice in data preparation
for machine learning: the data must be brought to a standard
scale without distorting the differences in the ranges of values
or losing information [9]. The aerosol model was constructed in
order to train the CAE by simulating a large number of LiDAR
observations (i.e., a synthetic data set) [10]. The output of
the generative adversarial networks (GANs) is the most likely
aerosol type inside the detected layers [10].
Deep learning techniques consist of deep layers in which
feature extraction and classification are not performed
separately. Deep learning is a subclass of machine learning
that processes data and learns patterns for use in
decision-making. The deep learning technique teaches the
machine to perform intelligent tasks. Deep learning includes
numerous techniques such as CNN, CAE, and GAN models [11]. A
CNN detects features and classifies the raw dataset
automatically. Deep learning can recognize hidden features more
accurately and efficiently [12].

Detecting aerosols manually is a challenging task because of
the varied properties of the environment. Hence, an automated
and accurate system is required for aerosol emission detection
[13]. This research aims to design an intelligent aerosol
emission detection system that uses fused LiDAR data and applies
deep and machine learning techniques to predict the emission
area [14].

The recognition of aerosol emissions has been considered an
essential application for numerous security branches and health
systems. Several researchers have applied handcrafted techniques
to identify such anomalies in the scene [15]. Handcrafted
features, such as local binary patterns from three orthogonal
planes, Gaussian mixture models, and Markov random fields, are
not well suited to anomaly recognition because they rely solely
on human assumptions. As a result, they do not describe the
training data well enough to learn discriminative features
characteristic of aerosols [16]. The data acquired from remote
sensing and satellites are the key characteristics and
contributions of this research. We used the convolutional
autoencoder (CAE) neural network to process the data; it takes
images and extracts the hidden patterns of the input images
before reconstructing the features from the hidden pattern. We
then established a sequential model in the autoencoder model,
which allows us to build the sequential layers of the network
simply, from input to output. Then, we applied a GAN, which
helps to solve tasks such as pattern recognition from
descriptions, obtaining high-resolution images from
low-resolution ones, and predicting whether or not an area is an
aerosol emission area [1].

All of this research [17] has revealed a diverse range of
aerosols. A wide range of aerosols is challenging to categorize
because of several difficulties (e.g., many aerosol types have
identical optical characteristics). Another challenge in the
categorization of aerosols is the difficulty of linking their
optical properties to their physical properties and source [18].
In reality, atmospheric aerosols are made up of a variety of
substances. There are many sources, and data on pure aerosol
types are scarce. Systematic measurements and intensive
measurement campaigns employing various aerosol measurement
methods have been carried out to address these difficulties.

TABLE 1: Comparative analysis of previous studies conducted on LiDAR.

Ref                        Technique                             Source of data        Outcome                                        Accuracy
Xiu et al. [1]             Semantic segmentation                 LiDAR dataset         The entire waveform of LiDAR                   89%
Zhang et al. [2]           Computer vision techniques            Landsat 8 data        Retrieval of forest aboveground biomass        90.3%
Melotti and Premebida [5]  Multimodal deep learning              Self-created dataset  Object recognition combining camera and LiDAR  93%
Zhang et al. [7]           Multisource data fusion               LiDAR dataset         Data fusion                                    88%
Wahid et al. [6]           Object exploration vision techniques  Self-created dataset  Distributed soft actor critics                 86.45%

Many Earth systems, such as temperature, air quality, and
hydrology, are affected by the atmospheric properties of clouds
and aerosols, and their effects are strongly influenced by their
height, thickness, and type. Near ground level, liquid water
clouds tend to reflect incoming sunlight, which helps cool the
surface [18]. In addition, ice clouds in the upper troposphere
can absorb heat from the surface and reemit it, contributing to
the rise in surface temperature [19]. Dust, smoke, sulfur, and
particles from fossil fuel burning are all examples of aerosol
particles. The ability of aerosols to cool or heat the surface
depends on their size, composition, and position in the
atmosphere [20]. Many types of aerosols, including dark-colored
ones such as black carbon from fossil fuel combustion, are known
to absorb radiation. Table 1 shows a comparative analysis of
previous studies.

Accordingly, the contributions of the current study are as
follows:


(i) The use of LiDAR and orthophoto fusion combined with a
deep learning (DL) strategy to detect aerosols


(ii) DL has progressed past the multilayer perceptron and
now includes the following


This study employs an autoencoder framework and a
convolutional neural network (CNN) to accomplish feature
dimensionality reduction and object classification into aerosol
and non-aerosol classes in LiDAR and orthoimage data after
segmentation.


2. Materials and Methods 


An aerosol model was used to investigate the optical
characteristics of pure aerosols produced by a single source
(e.g., dust produced by deserts and marine particles produced by
the oceans). Continental, continental polluted, dust, marine,
smoke, and volcanic are the six types of pure aerosols addressed
in this article. The aerosol model combines the Global Aerosol
Data Set (GADS) with iterative computations of each aerosol
type's intensive optical properties, as well as a T-matrix
numerical technique. The OPAC (Optical Properties of Aerosols
and Clouds) software package was used to determine the chemical
composition of each pure aerosol type. To replicate the vast
spectrum of particles in the atmosphere, the chemical
composition of each aerosol type was varied within specified
bounds. The aerosol model was used to create a synthetic
database for wavelengths of 350, 550, and 1000 nm. These
wavelengths were selected from OPAC's 61 wavelengths
(0.25-40 µm) for which GADS provides microphysical aerosol
parameters. After that, the wavelengths are rescaled in
angstroms to match the conventional LiDAR wavelengths (i.e.,
355, 532, and 1064 nm). This was deemed an acceptable assumption
for all aerosol types, given the small difference between the
LiDAR and model wavelengths. The aerosol model can be expanded
to cover more wavelengths if necessary.

Every type of pure aerosol is made up of an internal
combination of fundamental components in variable mixing ratios
that do not interact physically or chemically. Water-soluble,
insoluble, soot, mineral, sulfate, and sea salt components
(accumulation, coarse) are all provided by OPAC. The
microphysical properties of each component are stored in the
GADS database. Smoke and continental polluted types, however,
cannot attain values above 1.2 for the Ångström exponent (550 to
350 nm) with the present GADS soot refractive index values.

Figure 1 depicts the workflow of the suggested technique 
combining a convolutional autoencoder, a sequential algo- 
rithm, and a GAN. The following are the details of the block 
diagram. 


2.1. Data Acquisition of UAV LiDAR Datasets. The input images
come from the remote sensing LiDAR data of the aerosol emission
dataset and have an input shape of 128 × 128 × 3. The
convolutional autoencoder (CAE) uses this image as an input. To
recover the hidden patterns, the CAE passes the input image
through convolutional and pooling layers. The result is then
supplied to the deconvolutional and unpooling layers, which
reconstruct the features from the hidden pattern. We used a
sequential approach, which made it simple to build subsequent
network layers in order from input to output. Then, we employed
a GAN to tackle problems such as image generation from
descriptions or features, converting low-resolution image frames
to high-resolution image frames, detecting whether emission
activity is present or not, and retrieving image frames
containing a given pattern.

One of the most well-known datasets in the field of aerosol
detection is the LiDAR fusion dataset. It contains data from an
aerial view, LiDAR, and other sensors attached to the top of a
drone that flies through various environments and scenarios.

FIGURE 1: Proposed methodology.

This collection contains LiDAR frames that have been converted
to 2D depth images. These 2D depth images show the same scene as
the corresponding LiDAR frame but are more user-friendly.

The 360° LiDAR frames, like those in the dataset, are arranged
in a cylinder around the sensor. The 2D depth images in this
dataset can be viewed as if the cylinder of the LiDAR frame had
been cut open and unrolled into a 2D plane. The pixels in these
2D depth images represent the distance of the reflecting object
from the LiDAR sensor. The vertical resolution of the 2D depth
image (64 in our case) corresponds to the number of laser beams
used to scan the surroundings. These 2D depth images can be used
for segmentation, detection, recognition, and other tasks,
drawing on a large body of computer vision literature on 2D
images. We compared our model with a hybrid model of GAN and
autoencoder to evaluate performance.
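
The unrolling of a cylindrical LiDAR frame into a 2D depth image can be sketched as below. This is only an illustration under stated assumptions: each point is taken to carry its laser beam index, one image column covers an equal slice of azimuth, and the nearest return wins when several points fall into one pixel. The function name, the tuple layout, and the 360-column width are hypothetical; only the 64 beams come from the text.

```python
import math


def to_range_image(points, n_beams=64, n_cols=360):
    """Project (x, y, z, beam) points onto an n_beams x n_cols depth image.

    Rows are laser beams, columns are azimuth bins, and each pixel stores
    the distance to the reflecting object. 0.0 marks "no return".
    """
    image = [[0.0] * n_cols for _ in range(n_beams)]
    for x, y, z, beam in points:
        distance = math.sqrt(x * x + y * y + z * z)
        azimuth = math.atan2(y, x) % (2 * math.pi)  # wrap into [0, 2*pi)
        col = min(int(azimuth / (2 * math.pi) * n_cols), n_cols - 1)
        # Assumption: keep the nearest return when a pixel is hit twice.
        if image[beam][col] == 0.0 or distance < image[beam][col]:
            image[beam][col] = distance
    return image


# Three hypothetical returns on two beams.
frame = [(3.0, 4.0, 0.0, 0), (6.0, 8.0, 0.0, 0), (-3.0, 4.0, 0.0, 1)]
depth = to_range_image(frame, n_beams=2)
```

Once the frame is in this form, any 2D image operator (convolution, pooling, resizing) can be applied to it directly.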
FIGURE 2: Generative adversarial network (GAN).


2.2. Model Training. The proposed technique defined the rule
for when emissions occurred. We trained the model, which
contains spatial feature descriptors. The image description
explains the visual features of each frame. Each frame has its
own characteristics, such as shape, color, and texture. This
description provides a feature vector. The convolutional
autoencoder model is trained with blocks of pixels that contain
only regular segments, so that the error between the input and
output volumes of the frames is reduced. When the model is
trained correctly on regular images, it shows a low
reconstruction error for them. Each testing input image produces
a reconstruction error, which depends on the custom loss. We set
a threshold on this value: if the value crosses the threshold,
the frame shows an aerosol emission; below the threshold, it
represents a regular event. Thus, the system is able to
recognize the rare events that occur in the images.
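
The thresholding rule described in this section can be sketched as follows. The paper only says that the reconstruction error depends on a custom loss, so mean squared error is used here as a stand-in assumption; the function names and the example threshold are hypothetical.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between two equally sized flat pixel lists.

    Assumption: MSE replaces the paper's unspecified custom loss.
    """
    if len(original) != len(reconstructed):
        raise ValueError("frames must have the same number of pixels")
    squared = [(a - b) ** 2 for a, b in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def is_aerosol_emission(original, reconstructed, threshold):
    """Flag a frame when its reconstruction error exceeds the threshold."""
    return reconstruction_error(original, reconstructed) > threshold


# Hypothetical two-pixel frame and its reconstruction.
frame = [1.0, 3.0]
rebuilt = [1.0, 1.0]
flagged = is_aerosol_emission(frame, rebuilt, threshold=1.0)
```

Since the autoencoder is trained on regular segments only, regular frames are rebuilt well and stay below the threshold, while unfamiliar patterns are rebuilt poorly and exceed it.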


2.3. Model Parameters. The model is trained to reduce the
reconstruction error of the input volume. The proposed model
used the Adam optimizer, which sets the learning rate
automatically based on the update history of the model's
weights. The mini-batch size is 64. Each training volume is
trained for a maximum of 50 epochs, or until the validation loss
has not decreased for 10 consecutive epochs. The hyperbolic
tangent is chosen as the activation function of the spatial
autoencoder. To guarantee the regularity of the encoding and
decoding functions, we did not use the rectified linear unit
(ReLU), despite its regularization capacity, because activated
values from ReLU have no upper bound.
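
One reading of the stopping rule in this section is early stopping: train for at most 50 epochs and stop once the validation loss has failed to improve for 10 consecutive epochs. The sketch below illustrates that reading only; it is an interpretation, not the authors' training code, and the function name and loss curves are hypothetical.

```python
def stopping_epoch(val_losses, max_epochs=50, patience=10):
    """Return the 1-based epoch at which training would stop.

    val_losses holds one validation loss per epoch. Training stops after
    `patience` consecutive epochs without a new best loss, or when
    `max_epochs` (or the end of the list) is reached.
    """
    best = float("inf")
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


# Hypothetical loss curve: improves for two epochs, then plateaus.
plateau = [1.0, 0.9] + [0.95] * 20
stopped_at = stopping_epoch(plateau)
```

In the plateau example the best loss is reached at epoch 2, so the tenth stale epoch is epoch 12 and training ends well before the 50-epoch cap.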


2.4. Convolutional Autoencoder Model. An autoencoder is an
encoder-decoder system that reconstructs the input as the
output. The autoencoder consists of two subsystems. The encoder
converts the input image frame into a feature vector for
internal representation [6]. The decoder, on the other hand,
translates the internal representation back into a
reconstruction of the original image. The autoencoder provides a
reconstruction error [19]. A minimal reconstruction error means
that there is only a slight difference between the input and the
reconstructed image frames [20].


2.5. Sequential Model. The sequential model was used, which
makes it simple to stack sequential network layers from input to
output. Figure 2 shows the generative adversarial network (GAN).


2.6. Generative Adversarial Network Model. A GAN is a
generative modeling method based on CNNs. In machine learning,
generative modeling is an unsupervised learning problem [21]. It
pits two neural networks against each other to automatically
find and learn regularities or patterns in incoming data. The
adversarial competition involves two parts: the generator, which
imitates real data in order to create fake data, and the
discriminator, which tries to catch the generator by
distinguishing between real and fake data [12].
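
The competition between generator and discriminator can be made concrete with the standard binary cross-entropy GAN objective. The paper does not state which loss it used, so this is a sketch under that assumption; the function names are hypothetical, and the inputs are discriminator outputs interpreted as the probability that a sample is real.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and one 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Real samples should score 1, generated samples should score 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating form: the generator wants its samples scored as real."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
undecided = discriminator_loss([0.5], [0.5])
```

The two losses pull in opposite directions on the same discriminator scores, which is what drives the adversarial training described above.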

As a result, we used GAN to perform tasks such as image 
generation from descriptions or features, obtaining high res- 
olution image frames from low resolution ones, predicting 
which emission activity is active and which is not and 
retrieving image frames containing a given pattern. 
Figure 3 shows classification using the generative adversarial 
model. 


2.7. Model Description. As shown in Figure 1, the UAV fused
LiDAR dataset utilized in this investigation was collected above
the Universiti Sains Malaysia campus on February 3, 2018, at
midday. A Canon PowerShot SX230 HS (5 mm) camera was used to
collect data from a UAV flying at a height of 353 meters. The
photographs have three channels (RGB), a ground resolution of
around 9.95 cm/pixel, an image size of 4000 × 3000 pixels, and
an 8-bit radiometric resolution. An orthomosaic of the collected
image series was produced with an average root mean square error
(RMSE) of 0.192894 m (1.08795 pixels). The DSM was also created
with Agisoft PhotoScan Professional. The chosen subset spans a
total area of 1.68 km². The DSM's resolution was 79.6 cm/pixel,
while Agisoft's point clouds had a point density of about
1.58 points/m². Figure 1 depicts the operational flow of the
suggested technique using a convolutional autoencoder, a
sequential algorithm, and a generative adversarial network
(GAN). The following are the details of the block diagram:

The input images come from the real-world UAV aerosol
collection and have a 128 × 128 × 3 input shape. The
convolutional autoencoder (CAE) uses this image as an input. The
CAE extracts latent patterns from the input images
(128 × 128 × 3) using convolutional and pooling layers. The
result is then fed into the deconvolutional and unpooling
layers, which recreate the characteristics of the hidden
pattern. We used the sequential model, which allows us to stack
sequential network layers from input to output effortlessly.
Then, we used a GAN to help with tasks such as image generation
from descriptions or features, obtaining high-resolution image
frames from low-resolution ones, predicting whether aberrant
activity is present or not, and retrieving image frames
containing a given pattern.

Feature descriptors produce feature vectors from an input image
frame. A feature descriptor is a set of numbers that encodes
useful information. To validate the results, the UAV data were
divided into two subsets: training (80%) and testing (20%). The
convolutional autoencoder and the GAN model are the two deep
learning algorithms used. The purpose of each model is to
generate reconstructed images in a hybrid way by using an output
layer from the previous models. The sequential model has been
used in the CAE for sequencing the stacked layers.
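
The 80/20 division of the UAV data can be sketched as a seeded random split. The paper does not say how samples were assigned to each subset, so the shuffling, the seed, and the function name are assumptions made for illustration.

```python
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples reproducibly and split them into two lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # local generator, fixed seed
    cut = int(round(len(items) * train_fraction))
    return items[:cut], items[cut:]


# 100 hypothetical frame identifiers.
train_ids, test_ids = split_train_test(range(100))
```

Fixing the seed keeps the test subset identical between runs, so reported accuracies refer to the same held-out frames.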


2.8. Raw Image Data Processing. The UAV aerosol dataset is
used for testing and evaluating the proposed method. The aerosol
dataset contains 13 different real-world anomalies, such as
abuse, arrest, assault, and explosion. Because images are
combinations of frames, we converted the images into frames for
preprocessing and feature extraction. The frames are stored as
JPEG files and then resized; resizing is essential because the
frames do not all have the same dimensions. The resized images
are fed into the temporal volumes.
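
The resizing step brings every frame to one common size before it enters the network. The interpolation method is not given in the paper, so the sketch below uses nearest-neighbour sampling as an assumption; the function name and the toy frame are hypothetical.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2D image (list of rows) by nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# A hypothetical 2 x 2 frame enlarged to 4 x 4.
small = [[1, 2], [3, 4]]
large = resize_nearest(small, 4, 4)
```

In practice a frame would be resized to the 128 × 128 input shape quoted earlier, one channel at a time or with a library routine.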


2.9. Model Training. The proposed technique defined the rule
for when abnormal events occurred. Most regular frames differ
from the abnormal frames. We trained the model, which contains
spatial feature descriptors. The image description explains the
visual features of each image frame. Each frame has its own
characteristics, such as shape, color, texture, and motion. This
description provides a feature vector. The convolutional
autoencoder model is trained with image blocks that contain only
regular segments, so that the error between the input and output
volumes of the frames is reduced. When the model is trained
correctly on regular image frames, it shows a low reconstruction
error for them. Each testing input volume produces a
reconstruction error, which depends on the custom loss. We set a
threshold on this value: if the value crosses the threshold, the
frame shows an abnormal event; below the threshold, it
represents a typical event. Thus, the system is able to
recognize the rare events that occur in the images. Table 2
shows the features that are extracted in the model.

FIGURE 3: Classification using the generative adversarial model.

One can find a wide variety of information about surface
elements, such as topography, texture, and shape, by studying
them in images and LiDAR surveys. Using many different features
might lead to overfitting, especially when the training set is
small. Other downsides of using many features are that they
increase the level of noise, the volume of redundant
information, and the computation time. To deal with the problem
of a high-dimensional feature space, an autoencoder-based
technique is proposed that reduces the dimensionality of the
feature space and improves low-level features by translating
them into fewer features (i.e., reduced low-level features). The
reduced features will most likely be more informative than the
initial raw features, assisting the creation of the full
detection model. CNN models were built to identify key
architectural attributes, which were then processed using a
series of convolution and pooling operations to convert
low-level features into high-level ones. This section discusses
the process after using autoencoders and CNN models to abstract
low-level properties.


2.10. Model Parameters. The model is trained to reduce the
reconstruction error of the input volume. The proposed model
used the Adam optimizer, which sets the learning rate
automatically based on the update history of the model's
weights. The mini-batch size is 64. Each training volume is
trained for a maximum of 50 epochs, or until the validation loss
has stopped decreasing for 10 consecutive epochs. The hyperbolic
tangent is chosen as the activation function of the spatial
autoencoder. To guarantee the regularity of the encoding and
decoding functions, we did not use the rectified linear unit
(ReLU), despite its regularization capacity, because activated
values from ReLU have no upper bound. An autoencoder is an
encoder-decoder system that reconstructs the input as the
output. The autoencoder consists of two subsystems. The encoder
converts the input image frame into a feature vector for
internal representation.

The decoder, on the other hand, translates the internal
representation back into the reconstructed images. The
autoencoder provides a reconstruction error. A minimal
reconstruction error means that there is only a slight
difference between the input image frame and the reconstructed
image frame.
TABLE 2: Features extraction from dataset.

Data        Feature   Attribute                   Description
Orthophoto  Spectral
Orthophoto  Texture   GLCM angular second moment  $\sum_{i,j=0}^{N-1} P_{i,j}^{2}$
Orthophoto  Texture   GLCM contrast               $\sum_{i,j=0}^{N-1} P_{i,j}\,(i-j)^{2}$
Orthophoto  Texture   GLCM correlation            $\sum_{i,j=0}^{N-1} P_{i,j}\,\frac{(i-\mu_i)(j-\mu_j)}{\sigma_i\,\sigma_j}$
Orthophoto  Texture   GLCM entropy                $\sum_{i,j=0}^{N-1} P_{i,j}\,(-\ln P_{i,j})$
Orthophoto  Texture   GLCM homogeneity            $\sum_{i,j=0}^{N-1} \frac{P_{i,j}}{1+(i-j)^{2}}$
Orthophoto  Texture   GLCM mean                   $\mu_i = \sum_{i,j=0}^{N-1} i\,P_{i,j}$, $\mu_j = \sum_{i,j=0}^{N-1} j\,P_{i,j}$
LiDAR       Shape     Area                        Area of segments
LiDAR       Shape     Compactness                 Compactness of polygon
LiDAR       Shape     Density                     Density of holes
LiDAR       LiDAR     DEM                         Digital elevation model
LiDAR       LiDAR     DSM                         Digital surface model
LiDAR       LiDAR     nDSM                        Object height (DSM - DEM)


2.11. Sequential Model. The sequential model shown in Figure 4
was employed, which allows us to easily stack sequential network
layers from input to output.


2.12. Generative Adversarial Network (GAN). A GAN is a
generative modeling method based on CNNs. In machine learning,
generative modeling is an unsupervised learning problem. It pits
two neural networks against each other to find and learn
regularities or patterns in incoming data. We therefore used a
GAN for tasks such as image generation from descriptions or
features, obtaining high-resolution image frames from
low-resolution ones, predicting whether an activity is abnormal
or not, and retrieving image frames that contain a given
pattern.


3. Results and Discussion


Neural complexity addresses the lower bounds on the neural
resources (neuron counts) that a network needs to perform a
specific task within a certain tolerance. Information complexity
measures the lower bounds on the information (i.e., the number
of examples) required for the intended input-output function.
This study suggests a low-complexity super-resolution (SR)
convolutional neural network (CNN). The computational complexity
of the suggested strategy is 71.37 percent lower on CPU, TPU,
and GPU than that of the very-deep SR (VDSR) technique, with a
peak signal-to-noise ratio loss of 0.49 dB.

An autoencoder, a generative model, was used in the proposed
model. Image samples are used to train the autoencoder, and
testing images are used to predict the aerosol. The autoencoder
is made up of an encoder and a decoder. For the reconstructed
images, the trained model's loss function is calculated. At the
feature extraction stage, as shown in
Figure 5, a total of 21 features, including spectral, shape,
textural, and LiDAR-based attributes, were extracted to
recognize aerosol layer objects in LiDAR and orthophoto data.
Spectral features were used to evaluate the mean pixel values in
the orthophoto bands. Shape attributes are the geometric
information of meaningful objects, determined from the pixels
that make up these objects. To make sure that these features are
used effectively, the map must be segmented accurately. Haralick
texture characteristics were also used to construct texture
features based on the grey-level co-occurrence matrix (GLCM) or
the grey-level difference vector. The topography and height of
objects, in turn, were described using LiDAR-based
characteristics. The detection and description of aerosol layers
are critical elements in the reconstruction of aerosol layer
objects. The former refers to a method for distinguishing
aerosol layer objects among various objects [20]. The latter is
concerned with defining the geometric boundary of aerosol layer
objects so that their computation and concentration information
can be displayed as attributes connected with the objects in a
geographic information system (GIS). On the one hand, orthophoto
has a high spatial resolution and exhibits strong reflectance
around the boundaries of aerosol layers. However, the spectral
similarity of distinct ground objects complicates orthophoto
extraction of aerosol layers. On the other hand, because of the
relatively small footprint size of the laser beam and
unfavorable backscattering from illuminated targets, detecting
aerosol layer edges with height discontinuities is difficult in
LiDAR [20]. Using orthophoto and LiDAR together can therefore
increase the accuracy of aerosol layer detection and
description. Data fusion is the process of using or combining
data from multiple sources to form a new dataset and achieve a
certain aim [21]. Pixel-, feature-, and decision-level fusion
are the three levels at which information from numerous sources
can be combined. Because aerosol layer detection and description
using object-based analysis are simpler and more efficient, the
current research adopts the feature level. Low-level features
for aerosol layer detection are formed using orthophoto features
(e.g., spectral and textural features) and LiDAR features (e.g.,
DSM, DEM, nDSM, and spatial features) (Table 2).

FIGURE 4: Sequential model.

FIGURE 5: Experimental approach.
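
The Haralick texture features named in this section are computed from a grey-level co-occurrence matrix. The sketch below builds a normalized, symmetric GLCM for a single pixel offset and evaluates four of the standard measures. It is illustrative only: the offset, the number of grey levels, and the function names are assumptions, and the image is assumed to be already quantized to integer grey levels.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized symmetric co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < h and 0 <= c2 < w:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image is too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Angular second moment, contrast, homogeneity, and entropy of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "asm": sum(v * v for _, _, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Hypothetical 2 x 2 patches with two grey levels.
flat = texture_features(glcm([[0, 0], [0, 0]], levels=2))
checker = texture_features(glcm([[0, 1], [1, 0]], levels=2))
```

A flat patch yields zero contrast and zero entropy, while the alternating patch yields higher contrast, which is the kind of difference the texture features are meant to capture.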

Many features belonging to spectral, textural, topographic, and
shape groups can be extracted from orthophoto and LiDAR data.
Overfitting can occur when many features are used, especially
when the training samples are few. Noise, redundant information,
and more computing time are further drawbacks of employing a
large number of features. The current study uses an
autoencoder-based technique to reduce the dimensionality of the
feature space and improve low-level features by compressing them
into fewer features (i.e., reduced low-level features). The new
features should be more informative than the original ones, and
they should improve the overall workflow for recognizing aerosol
layers. A CNN model is also developed by executing several
convolution and pooling operations to select suitable features
for identifying aerosol layers and to turn reduced low-level
features into high-level features. The next sections describe
how the autoencoder and CNN models are used to reduce (or
abstract) low-level features.

The model has been properly trained when the reconstruction 
error is small; if the error is large, the model was not 
sufficiently trained. After training, the model is evaluated 
on test images. All of the layers in the autoencoder with 
the dense layer are fully connected. The information is 
passed through the bottleneck layer between the encoder and 
the decoder. In a simple autoencoder, only one frame is used 
at a time. The figure depicts the layer structure and 
properties of the autoencoder. Each convolution layer has a 
3 × 3 filter size, with 128, 192, and 256 filters. The 
outputs of the convolution filters are aggregated in the 
max-pooling layer, which uses a 3 × 3 window. The image 
volume is normalized by the normalization layers. The 
activation function is applied by the ReLU layers. The 
softmax layer detects the aerosol frames, and the loss 
function is used for training. The sigmoid response value 
varies between 0.5 and 0.7. Figure 6 shows the structure of 
the max pooling layer. 
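
To illustrate what a 3 × 3 max-pooling window does, the following sketch applies it to a small 2D feature map. It is a simplified 2D illustration rather than the 3D layers of the model, and the stride of 3 (non-overlapping windows) and the input values are assumptions, since the stride is not stated here.

```python
# Minimal 2D max-pooling sketch (the stride value is an assumption).

def max_pool2d(feature_map, size=3, stride=3):
    """Slide a size x size window over a 2D map and keep each window's maximum."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# Hypothetical 6 x 6 feature map produced by a convolution layer.
feature_map = [
    [1, 2, 3, 0, 1, 0],
    [4, 5, 6, 1, 0, 2],
    [7, 8, 9, 0, 3, 1],
    [0, 1, 0, 2, 2, 2],
    [1, 0, 1, 2, 5, 2],
    [0, 1, 0, 2, 2, 2],
]
pooled = max_pool2d(feature_map)  # 6 x 6 input becomes a 2 x 2 output
```

Each output cell keeps only the strongest response in its window, which is how pooling shrinks the feature map while preserving the most salient activations.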


Figure 6 shows an autoencoder that uses layers (Conv3D, 
batch normalization, and ReLU activation function) to 
translate the input image frame into a feature vector that 
serves as the internal representation. The decoder uses this 
internal representation. The second column expresses the 
shape in vector form, and the third column shows the 
reconstructed image frames. 


3.1. Sequential Model. The sequential model is built by 
applying a 3D convolutional neural network and varying the 
number of filters in the convolutional layers. A sequential 
model is suitable for a plain stack of layers in which each 
layer has exactly one input tensor and one output tensor. 
The model creates its weights the first time it is called on 
an input image, since the shape of the weights depends on 
the shape of the image frames. Before training, the learning 
process is configured via the compile function, which 
receives three arguments: an optimizer, a loss function, and 
a list of metrics. 

The optimizer is given either as a string identifier or as a 
call to an optimizer function. The main aim in training the 
sequential model is to minimize the loss function, which is 
likewise given as a string identifier, e.g., mean squared 
error. 
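
As a small worked example of the mean squared error loss mentioned above, the sketch below computes it between a flattened input frame and its reconstruction. The function name and the pixel values are hypothetical; the framework's built-in loss would be used in practice.

```python
# Minimal sketch of the mean squared error between a frame and its reconstruction.

def mean_squared_error(original, reconstructed):
    """Average of the squared element-wise differences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)

# Hypothetical flattened pixel intensities in [0, 1].
frame = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.0, 0.5, 0.5, 0.5]
loss = mean_squared_error(frame, reconstruction)
```

A perfect reconstruction gives a loss of zero, and the value grows with the squared size of each pixel error, so a few large errors weigh more than many small ones.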

The output of the fully connected layer of the CNN is passed 
to a softmax, and the sigmoid function is used to combine 
the results of each layer in the sequential model shown in 
Figure 7. For classification, a median filter of size three 
is applied to the output to smooth out variations in the 
classification of anomalies. 
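
The smoothing step can be sketched as a one-dimensional median filter of size three over the sequence of per-frame decisions. The edge handling (repeating the first and last values) and the example sequence are assumptions made for illustration.

```python
# Minimal sketch of a size-3 median filter over per-frame decisions.
from statistics import median

def median_filter(labels, size=3):
    """Replace each value by the median of its window.

    Edges are padded by repeating the first and last values (an assumption).
    """
    half = size // 2
    padded = [labels[0]] * half + list(labels) + [labels[-1]] * half
    return [median(padded[i:i + size]) for i in range(len(labels))]

# Hypothetical per-frame anomaly decisions (1 = anomaly, 0 = normal).
frame_decisions = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
smoothed = median_filter(frame_decisions)
```

In this example the isolated detection at the third frame and the isolated miss at the eighth frame are both removed, leaving one contiguous run of anomalous frames.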

Figure 7 shows the sequential model created by applying a 3D 
convolutional network and changing the number of filters in 
the convolutional layers. The shape of the weights that map 
input to output depends on the shape of the image frames. 


3.2. Generative Adversarial Network (GAN). The GAN model is 
used here to reconstruct the images at HD resolution. 
Figure 6: The structure of the max pooling layer. 
Figure 7: The internal structure of the sequential model. 


The following parameters are used to evaluate our work. To 
validate the suggested technique, performance measures such 
as accuracy, sensitivity, specificity, and AUC are 
determined: 

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$

$$\mathrm{AUC} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2}$$

Figure 8: Training of data. 
Figure 9: Model accuracy and loss. 

(i) False-negative (FN): the feature detected result is 0, 
and predictive power is present 


(ii) True-negative (TN): the feature detected result is 0, 
and predictive power is absent 


(iii) False-positive (FP): the feature detected result is 1, 
and predictive power is absent 


(iv) True-positive (TP): the feature detected result is 1, 
and predictive power is present 
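
The confusion-matrix counts defined above can be turned into accuracy, sensitivity, and specificity as in the following minimal sketch. The counts are hypothetical and are not results from the experiments.

```python
# Minimal sketch: performance measures from confusion-matrix counts.

def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity, and specificity from the four counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # share of present cases detected
        "specificity": tn / (tn + fp),   # share of absent cases rejected
    }

# Hypothetical counts for 100 test frames.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can stay high even if most positive frames are missed.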


3.3. Training. In the first step, we trained our model on 
70% of the data, reaching a training loss of 0.00186344. 
Figure 8 shows the training of the data. 


3.4. Testing. The GAN model was tested on 30% of the 
dataset. The accuracy of the model in detecting aerosols is 
almost 98% at 140 epochs, while at 60 epochs it is 97%. 
Figure 9 shows model accuracy and loss. 

Figure 11: Aerosol detection using the GAN model. 
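
A 70%/30% division of the data into training and testing sets can be sketched as follows. The shuffling, the fixed seed, and the function name are assumptions made for reproducibility of the example; how the original split was drawn is not stated.

```python
# Minimal sketch of a 70/30 train/test split (shuffling and seed are assumptions).
import random

def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it at the training fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

frames = list(range(100))  # hypothetical frame identifiers
train, test = train_test_split(frames)
```

Every frame ends up in exactly one of the two sets, so the reported test accuracy is measured on data the model did not see during training.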

Figure 10 shows the training and validation accuracy and 
loss of the CAE model in classifying the images of aerosols 
from the datasets. The model reached 98% training accuracy 
and 98.7% validation accuracy during the experiments. 
Figure 11 shows aerosol detection using the GAN model. 

Figure 12 shows aerosol outlier detection at different time 
spans. In comparison with the GAN model (right side) at 60 
epochs, the CAE reached an accuracy of 98% on training and 
99% on testing with the inclusion of the GAN model. The 
training and validation accuracy curves show no drop when 
DSM and RGB are combined (left) and a loss of information 
when RGB is used alone (right), both without dropout. The 
parameters were therefore examined and tuned to maximize 
detection accuracy. According to the sensitivity analysis of 
these parameters, the best 10-fold cross-validation accuracy 
for aerosol detection was achieved for the area in which the 
tests were done. The study found that 128 filters delivered 
an accuracy of 98.76%, whereas the poorest results (15.5% 
accuracy) were seen when using 64 filters. Adam delivered an 
accuracy of 81.41%, which is much better than the other 
optimization techniques. Conversely, the number of hidden 
units in the dense layer had little consequence. 

In testing, the most accurate results (98.76%) were produced 
using 10 or 100 units. The results for 50 and 3 units were 
less accurate (81.61%). Using a smaller number of units in 
the fully connected layer benefits the computational 
performance of the model; hence, 10 units are considered the 
best choice. 

We have also compared our technique with state-of-the-art 
algorithms, as shown in Table 3. 

Figure 12: Aerosol outlier detection at different time spans. 

Table 3: Comparison of the current study with previous studies. 

References     Techniques     Accuracy   Outcome 
[2]            CNN            89%        Object recognition 
[1]            Mark CNN       88%        Waveform recognition 
[5]            Mod CNN        88.5%      Biomass land recognition 
Our proposed   CAE with GAN   98%        Outlier aerosol recognition 
Our proposed   ADAM           81.41%     Outlier aerosol recognition 
Our proposed   AdaGrad        23.57%     Outlier aerosol recognition 
Our proposed   AdaDelta       38.95%     Outlier aerosol recognition 
Our proposed   SGD            78.41%     Outlier aerosol recognition 
Our proposed   RMSProp        69.55%     Outlier aerosol recognition 


4. Conclusions 


This study employed autoencoder and CNN models to detect 
aerosols in a LiDAR-orthophoto dataset, resulting in a deep 
learning (DL) method. The architecture generates objects 
using multiresolution and spectral difference 
segmentations. Nine distinct features, including spectral, 
textural, LiDAR, and orthofusion features, were identified 
for the detection of aerosols. Next, they were compressed 
into 10 features at the feature level using the autoencoder 
model. The high-level features generated from the compressed 
features were then used to categorize the objects. Aerosol 
layer detection using this design has many advantages, 
including automated feature selection and removal of 
redundant features. Convolutional neural networks (CNNs) 
were utilized to convert the compressed information into 
high-level features that could categorize the outer layer 
of atmospheric particles. This research describes deep 
learning approaches that, when applied to LiDAR data, 
allowed for detecting 40% more atmospheric features at a 
horizontal resolution of 15 km during daytime operations. 
In comparison to existing deep learning algorithms for edges 
and complicated near-surface scenes during the day, a 
convolutional autoencoder (CAE) trained using LiDAR dataset 
standard data products showed the potential for improved 
aerosol discrimination. However, the dataset including 
height information (the fused orthomosaic photo and DSM) 
performed better in most discriminative classifications. 
This study demonstrated the CAE's capacity to accurately 
categorize lower-resolution UAV-fused LiDAR images in 
comparison to very-high-resolution aerial images and also 
indicated that dataset fusion is promising. The model 
reached 98% training accuracy and 98.7% validation accuracy 
during the experiments. In comparison with the GAN model at 
60 epochs, the CAE reached an accuracy of 98% on training 
and 99% on testing with the inclusion of the GAN model. The 
sensitivity of CNNs with various fusion methods to the 
training dataset, regularization functions, and optimizers 
will be the subject of future research. 